Introduction

Understanding the factors that drive a country’s total CO₂ emissions is critical for designing effective climate policies. The Kaya identity decomposes emissions into four drivers: population, economic activity (GDP per capita), energy intensity (energy per unit GDP), and carbon intensity (CO₂ per unit energy). This lets us quantify each factor’s relative influence on emissions trends. In this project, we fit a regression based on the Kaya identity across multiple countries to answer:

  • Which drivers most strongly explain cross-country differences in CO₂ output?
  • How do population, wealth, energy intensity, and carbon intensity differ in their impact?
  • Are there notable outliers that deviate from expected efficiency patterns?

By examining a global sample, we aim to uncover where policy interventions (e.g., improving energy efficiency or shifting fuel mixes) may yield the greatest emissions reductions.
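
For reference, the Kaya identity is usually written as a product of the four drivers. In the notation below, C is CO₂ emissions, P is population, G is GDP, and E is primary energy; these symbols are chosen here for illustration and are not taken from the project code. The regression later in this report uses the fossil share of energy (f_tot/e) in place of the last term.

```latex
C = P \times \frac{G}{P} \times \frac{E}{G} \times \frac{C}{E}
\qquad\Longleftrightarrow\qquad
\ln C = \ln P + \ln\frac{G}{P} + \ln\frac{E}{G} + \ln\frac{C}{E}
```

Because the identity is multiplicative, it is additive only in logarithms, which is worth keeping in mind when reading coefficients from a linear model fitted on the raw scale.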

Data Preparation

Load libraries

## Load libraries
library(readr)
library(readxl)
library(dplyr)
library(tidyr)
library(purrr)
library(ggplot2)
library(coefplot)

Importing Country Codes Data

Total primary energy sources

##  [1] "Coal"          "Peat"          "Crude"         "Oil_shale"    
##  [5] "Oil"           "Natural_gas"   "Nuclear"       "Hydro"        
##  [9] "Geothermal"    "Solar_wind"    "Biofuels"      "Heat_non_spec"
## [13] "Electricity"   "Heat"          "Total"         "Check"

Total final consumption (TJ)

##  [1] "Industry"                "Transport"              
##  [3] "Residential"             "Commerical"             
##  [5] "Agriculture"             "Fishing"                
##  [7] "Not_specified"           "Non-energy"             
##  [9] "Total_final_Consumption" "Check"

CO₂ from the World Bank, for comparison

Population (person)

GDP

Reshaping data
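
The reshaping code is not shown in this report. As a minimal sketch, assuming the source tables arrive in a wide layout with one column per year, the step could look like the following. The table `wide` and its values are hypothetical.

```r
library(tidyr)

# Hypothetical wide table: one row per country, one column per year
wide <- data.frame(
  iso3   = c("AAA", "BBB"),   # placeholder iso3 codes
  `2000` = c(10, 20),
  `2001` = c(11, 21),
  check.names = FALSE         # keep the year column names as they are
)

# Stack the year columns into (iso3, year, value) rows
long <- pivot_longer(wide, cols = -iso3, names_to = "year", values_to = "value")

# Year names come out as character strings; convert them to integers
long$year <- as.integer(long$year)
```

A long table of this shape can then be joined to the other indicators by `iso3` and `year`.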

Analysis

Regression

f_tot is the total fossil energy supply and should include coal, natural gas, and crude oil.

co2 ~ p + g_p + e_g + f_e, where g_p = g/p, e_g = e/g, and f_e = f_tot/e
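
In an R formula, `/` denotes nesting rather than division, so the ratio terms need to be computed as columns first (or wrapped in `I()`). The coefficient names in the output further down (`g_p`, `e_g`, `f_e`) suggest precomputed columns. The following is a minimal sketch on invented data; the data frame `toy` and all of its values are hypothetical.

```r
# Toy data with invented values; the real inputs would be the population,
# GDP, energy, and CO2 tables prepared earlier in the report.
set.seed(1)
n <- 30
toy <- data.frame(
  p     = runif(n, 1e6, 1e8),    # population (persons)
  g     = runif(n, 1e9, 1e12),   # GDP
  e     = runif(n, 1e4, 1e6),    # total primary energy (TJ)
  f_tot = runif(n, 1e3, 1e4),    # fossil energy: coal + natural gas + crude oil (TJ)
  co2   = runif(n, 1e2, 1e5)     # CO2 emissions
)

# Precompute the ratio terms as ordinary columns
toy$g_p <- toy$g / toy$p         # GDP per capita
toy$e_g <- toy$e / toy$g         # energy per unit GDP
toy$f_e <- toy$f_tot / toy$e     # fossil share of energy

kayaFormula <- co2 ~ p + g_p + e_g + f_e
fit <- lm(kayaFormula, data = toy)
coef(fit)
```

Since the response here is random, the fitted coefficients carry no meaning; the sketch only shows how the formula and columns fit together.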

Fit this model separately for each country (iso3 code).
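
One way to fit the model country by country is to split the data on `iso3`, fit each piece, and bind the coefficients into a table. This is a sketch in base R on invented data, not the project's code; the data frame `panel`, the placeholder country codes, and `coefTable` are hypothetical names.

```r
# Invented panel: 3 placeholder countries with 12 observations each
set.seed(2)
countries <- c("AAA", "BBB", "CCC")
panel <- do.call(rbind, lapply(countries, function(code) {
  data.frame(
    iso3 = code,
    p    = runif(12, 1e6, 1e8),
    g_p  = runif(12, 1e3, 5e4),
    e_g  = runif(12, 1e-6, 1e-4),
    f_e  = runif(12, 0.2, 0.9),
    co2  = runif(12, 1e3, 1e5)
  )
}))

kayaFormula <- co2 ~ p + g_p + e_g + f_e

# Fit one linear model per country
fits <- lapply(split(panel, panel$iso3), function(d) lm(kayaFormula, data = d))

# Collect the coefficients: one row per country, one column per term
coefTable <- do.call(rbind, lapply(fits, coef))
coefTable <- data.frame(iso3 = rownames(coefTable), coefTable,
                        row.names = NULL, check.names = FALSE)
coefTable
```

The same pattern could be written with `purrr::map()` in place of `lapply()`, since purrr is already loaded above.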

## 
## Call:
## lm(formula = kayaFormula, data = worldScale)
## 
## Coefficients:
## (Intercept)            p          g_p          e_g          f_e  
##    28246.05      1887.98      2202.29       -91.37       964.33
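
The coefficients above are expressed in the units of each predictor, so their magnitudes are not directly comparable. One common way to compare drivers is to standardize all variables before fitting. The sketch below uses invented data and an invented relationship purely for illustration; it is not a re-analysis of `worldScale`.

```r
set.seed(3)
n <- 40
toy <- data.frame(
  p   = runif(n, 1e6, 1e8),
  g_p = runif(n, 1e3, 5e4),
  e_g = runif(n, 1e-6, 1e-4),
  f_e = runif(n, 0.2, 0.9)
)
# Hypothetical response; this relationship is made up for the example
toy$co2 <- 2e-4 * toy$p + 0.1 * toy$g_p + rnorm(n, sd = 500)

# scale() centers each column and divides it by its standard deviation
z <- as.data.frame(scale(toy))
zfit <- lm(co2 ~ p + g_p + e_g + f_e, data = z)

# Coefficients are now in standard-deviation units and can be ranked
sort(abs(coef(zfit)[-1]), decreasing = TRUE)
```

A fitted model like `zfit` can also be passed to `coefplot::coefplot()` to plot the estimates with their uncertainty intervals.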

### Table with countries and their coefficients

Plotting scatter maps

### With all countries

Inference

In this analysis, we found that across countries, differences in CO₂ emissions are most strongly driven by energy intensity (energy per unit GDP) and by carbon intensity (CO₂ per unit energy). Population size and GDP per person also matter, but their impact is generally smaller. This suggests that the biggest gains in cutting emissions worldwide will come from policies that improve energy efficiency and shift to cleaner, lower‐carbon energy sources. We also see a few outliers, countries that emit much more or much less than expected given their size and wealth, which highlights places where targeted improvements could be especially effective.
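
Outliers of the kind mentioned above can be flagged from the model's standardized residuals. The following is a minimal sketch on invented data with one artificial outlier injected; the cutoff of 2 is a common rule of thumb assumed here, not a threshold taken from this analysis.

```r
set.seed(4)
n <- 25
toy <- data.frame(
  iso3 = sprintf("C%02d", 1:n),   # placeholder country codes
  p    = runif(n, 1e6, 1e8),
  g_p  = runif(n, 1e3, 5e4)
)
# Invented relationship, plus one artificially inflated observation
toy$co2 <- 2e-4 * toy$p + 0.1 * toy$g_p + rnorm(n, sd = 500)
toy$co2[5] <- toy$co2[5] + 2e4

fit <- lm(co2 ~ p + g_p, data = toy)

# Standardized residuals: residuals scaled by their estimated standard error
toy$std_resid <- rstandard(fit)

# Flag observations that sit more than 2 standard errors from the fit
outliers <- toy[abs(toy$std_resid) > 2, c("iso3", "std_resid")]
outliers
```

Positive residuals mark countries emitting more than the model expects, and negative residuals mark those emitting less.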